On the 'Semantics' of Differential Privacy: A Bayesian Formulation
Differential privacy is a definition of "privacy" for algorithms that
analyze and publish information about statistical databases. It is often
claimed that differential privacy provides guarantees against adversaries with
arbitrary side information. In this paper, we provide a precise formulation of
these guarantees in terms of the inferences drawn by a Bayesian adversary. We
show that this formulation is satisfied both by "vanilla" differential
privacy and by a relaxation known as (epsilon,delta)-differential privacy. Our
formulation follows the ideas originally due to Dwork and McSherry [Dwork
2006]. This paper is, to our knowledge, the first place such a formulation
appears explicitly. The analysis of the relaxed definition is new to this
paper, and provides some concrete guidance for setting parameters when using
(epsilon,delta)-differential privacy.
Comment: An older version of this paper was titled "A Note on Differential
Privacy: Defining Resistance to Arbitrary Side Information".
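For concreteness, here is a minimal sketch (not from the paper) of the
classic Laplace mechanism, which satisfies (epsilon, 0)-differential privacy
for a numeric query of known sensitivity; the function name and example
database are illustrative assumptions.

```python
import numpy as np

def laplace_mechanism(query_result, sensitivity, epsilon, rng=None):
    """Release query_result with (epsilon, 0)-differential privacy by
    adding Laplace noise with scale sensitivity / epsilon."""
    rng = np.random.default_rng() if rng is None else rng
    return query_result + rng.laplace(loc=0.0, scale=sensitivity / epsilon)

# Hypothetical example: privately release a count over a database.
database = [1, 0, 1, 1, 0, 1]   # one bit per individual
true_count = sum(database)      # a counting query has sensitivity 1
private_count = laplace_mechanism(true_count, sensitivity=1.0, epsilon=0.5)
print(private_count)
```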
Spanners for Geometric Intersection Graphs
Efficient algorithms are presented for constructing spanners in geometric
intersection graphs. For a unit ball graph in R^k, a (1+epsilon)-spanner is
obtained by efficiently partitioning the space into hypercubes and solving
bichromatic closest pair problems. The spanner construction runs in time
almost equivalent to that of constructing Euclidean minimum spanning trees.
The results are extended to arbitrary ball graphs with a sub-quadratic running
time.
For unit ball graphs, the spanners admit a small separator decomposition, which
can be used to obtain efficient algorithms for approximating proximity problems
such as diameter and distance queries. The results on compressed quadtrees,
geometric graph separators, and diameter approximation might be of independent
interest.
Comment: 16 pages, 5 figures, LaTeX
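As a rough illustration of the hypercube-partitioning idea above, the sketch
below buckets points in R^k into grid cells so that closest-pair searches
only need to look at nearby cells; the cell size and 3^k neighbor enumeration
are simplifying assumptions, not the paper's exact construction.

```python
from collections import defaultdict
from itertools import product

def bucket_points(points, cell_size):
    """Assign each point in R^k to the hypercube cell containing it."""
    cells = defaultdict(list)
    for idx, p in enumerate(points):
        cells[tuple(int(c // cell_size) for c in p)].append(idx)
    return cells

def neighboring_cells(key):
    """Yield the 3^k cells adjacent to (and including) a given cell."""
    for offset in product((-1, 0, 1), repeat=len(key)):
        yield tuple(k + o for k, o in zip(key, offset))

# With cell_size >= 1, two points at distance <= 1 (a unit ball graph
# edge) always land in the same or adjacent cells, so candidate edges
# can be found by purely local searches.
cells = bucket_points([(0.2, 0.3), (0.9, 0.1), (3.5, 3.5)], cell_size=1.0)
```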
Spectral Norm of Random Kernel Matrices with Applications to Privacy
Kernel methods are an extremely popular set of techniques used for many
important machine learning and data analysis applications. In addition to
having good practical performance, these methods are supported by a
well-developed theory. Kernel methods use an implicit mapping of the input data
into a high dimensional feature space defined by a kernel function, i.e., a
function returning the inner product between the images of two data points in
the feature space. Central to any kernel method is the kernel matrix, which is
built by evaluating the kernel function on a given sample dataset.
In this paper, we initiate the study of non-asymptotic spectral theory of
random kernel matrices. These are n x n random matrices whose (i,j)-th entry
is obtained by evaluating the kernel function on x_i and x_j, where
x_1, ..., x_n are n independent random high-dimensional vectors. Our
main contribution is to obtain tight upper bounds on the spectral norm (largest
eigenvalue) of random kernel matrices constructed from commonly used kernel
functions based on polynomials and the Gaussian radial basis function.
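For intuition, here is a minimal sketch of the object under study, assuming
a Gaussian RBF kernel and standard-normal data scaled to roughly unit norm
(both illustrative choices): build the n x n random kernel matrix and compute
its spectral norm.

```python
import numpy as np

rng = np.random.default_rng(0)
n, d = 200, 100
X = rng.standard_normal((n, d)) / np.sqrt(d)  # n random vectors, ~unit norm

# Gaussian RBF kernel: K[i, j] = exp(-||x_i - x_j||^2 / (2 * sigma^2)).
sigma = 1.0
sq_dists = np.sum((X[:, None, :] - X[None, :, :]) ** 2, axis=-1)
K = np.exp(-sq_dists / (2 * sigma ** 2))

# K is symmetric PSD, so its spectral norm is its largest eigenvalue.
spectral_norm = np.linalg.eigvalsh(K)[-1]
print(spectral_norm)
```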
As an application of these results, we provide lower bounds on the distortion
needed for releasing the coefficients of kernel ridge regression under
attribute privacy, a general privacy notion which captures a large class of
privacy definitions. Kernel ridge regression is a standard method for
performing non-parametric regression that regularly outperforms traditional
regression approaches in various domains. Our privacy distortion lower bounds
are the first for any kernel technique, and our analysis assumes realistic
input scenarios, unlike all previous lower bounds for other release problems,
which hold only under very restrictive input settings.
Comment: 16 pages, 1 Figure
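For reference, a minimal sketch of kernel ridge regression in its usual dual
form, alpha = (K + lambda * I)^{-1} y; the released coefficients alpha are
the quantity the distortion lower bounds concern. The kernel, data, and
regularization constant are illustrative, and the sketch implements none of
the paper's privacy analysis.

```python
import numpy as np

def rbf_kernel(A, B, sigma=1.0):
    """Gaussian RBF kernel matrix between the rows of A and of B."""
    sq = np.sum(A**2, 1)[:, None] + np.sum(B**2, 1)[None, :] - 2 * A @ B.T
    return np.exp(-sq / (2 * sigma ** 2))

def krr_fit(X, y, lam=1e-2, sigma=1.0):
    """Dual coefficients alpha = (K + lam * I)^{-1} y."""
    K = rbf_kernel(X, X, sigma)
    return np.linalg.solve(K + lam * np.eye(len(X)), y)

def krr_predict(X_train, alpha, X_test, sigma=1.0):
    """Predictions f(x) = sum_i alpha_i * k(x_i, x)."""
    return rbf_kernel(X_test, X_train, sigma) @ alpha

rng = np.random.default_rng(0)
X = rng.uniform(-3, 3, size=(100, 1))
y = np.sin(X[:, 0]) + 0.1 * rng.standard_normal(100)
alpha = krr_fit(X, y)       # these coefficients are what gets released
preds = krr_predict(X, alpha, X)
```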
Debiasing Conditional Stochastic Optimization
In this paper, we study the conditional stochastic optimization (CSO) problem,
which covers a variety of applications, including portfolio selection,
reinforcement learning, robust learning, and causal inference. The
sample-averaged gradient of the CSO objective is biased due to its nested
structure and therefore requires a high sample complexity to reach convergence.
We introduce a general stochastic extrapolation technique that effectively
reduces the bias. We show that for nonconvex smooth objectives, combining this
extrapolation with variance reduction techniques can achieve a significantly
better sample complexity than existing bounds. We also develop new algorithms
for the finite-sum variant of CSO that also significantly improve upon existing
results. Finally, we believe that our debiasing technique could be an
interesting tool applicable to other stochastic optimization problems as well.
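As a toy illustration (not the paper's algorithm) of the bias issue and of
extrapolation-based debiasing, take the nested objective f(E[g(eta)]) with
f(u) = u^2 and g(eta) = eta for eta ~ N(mu, 1); every name and constant
below is hypothetical.

```python
import numpy as np

rng = np.random.default_rng(0)
mu = 1.0
f = lambda u: u ** 2   # true objective value: f(E[eta]) = mu**2

def plugin_estimate(m):
    """Biased plug-in estimator: f applied to the mean of m inner samples.
    Here E[f(mean)] = mu**2 + 1/m, so the bias is O(1/m)."""
    return f(rng.normal(mu, 1.0, size=m).mean())

def extrapolated_estimate(m):
    """Richardson-style extrapolation 2*T(2m) - T(m) cancels the leading
    O(1/m) bias term (exactly, for this quadratic f)."""
    return 2 * plugin_estimate(2 * m) - plugin_estimate(m)

reps, m = 20000, 8
plain = np.mean([plugin_estimate(m) for _ in range(reps)])
debiased = np.mean([extrapolated_estimate(m) for _ in range(reps)])
print(f"true={mu**2:.3f} plug-in={plain:.3f} extrapolated={debiased:.3f}")
```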